Many researchers have given mathematical formulations of the ROC curve by assuming a specific distribution for the test scores. Conventionally, most of this work assumes a normal distribution. When the scores of both populations (groups) follow Generalized Exponential distributions, the Bi-Generalized Exponential distribution model is known to be a good model. In this paper we derive a new method of determining the AUC by utilizing the confidence intervals for the scale and location parameters of the two groups and taking a weighted average of the possible AUC values under different patterns of weights. We also study the behavior of the new estimator by simulation.
Introduction
The ROC (Receiver Operating Characteristic) curve is a key statistical tool used to evaluate the performance of classification tasks, such as distinguishing between diseased and healthy individuals based on diagnostic markers. Originally developed during World War II, ROC analysis is now widely applied in medicine, psychology, finance, and data mining.
This paper uses the Generalized Exponential (GE) distribution, a flexible three-parameter model (shape, scale, and location) useful for modeling lifetime and skewed data. A specific case, the Bi-Generalized Exponential distribution, models the test scores of the healthy and diseased populations as two independent GE distributions.
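For reference, one common parameterization of the three-parameter GE distribution is sketched below. The symbols (shape α, scale λ, location μ) are assumptions made here for illustration, since this section does not fix a notation.

```latex
F(x;\alpha,\lambda,\mu) = \left(1 - e^{-(x-\mu)/\lambda}\right)^{\alpha},
\qquad
f(x;\alpha,\lambda,\mu) = \frac{\alpha}{\lambda}\, e^{-(x-\mu)/\lambda}
\left(1 - e^{-(x-\mu)/\lambda}\right)^{\alpha-1},
\qquad x > \mu,\ \alpha > 0,\ \lambda > 0 .
```

Setting α = 1 recovers the two-parameter (shifted) exponential distribution.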
Using this model, new ROC curves and Area Under the Curve (AUC) formulas are derived where the healthy and diseased groups’ scores follow GE distributions with different parameters. The false positive rate and ROC functions are explicitly formulated.
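As a rough illustration of how the ROC curve and AUC follow from the two GE distributions, the sketch below evaluates them numerically. It assumes the CDF form F(x) = (1 - exp(-(x - loc)/scale))^shape and that larger scores indicate disease. The function names and the midpoint-rule integration are illustrative choices; the closed-form expressions derived in the paper are not reproduced here.

```python
import math

def ge_cdf(x, shape, scale, loc):
    """CDF of the three-parameter GE distribution (assumed parameterization)."""
    if x <= loc:
        return 0.0
    return (-math.expm1(-(x - loc) / scale)) ** shape

def ge_quantile(p, shape, scale, loc):
    """Inverse of ge_cdf for 0 < p < 1."""
    return loc - scale * math.log1p(-(p ** (1.0 / shape)))

def roc(t, healthy, diseased):
    """True positive rate at false positive rate t; groups are (shape, scale, loc)."""
    threshold = ge_quantile(1.0 - t, *healthy)   # healthy scores exceed this with probability t
    return 1.0 - ge_cdf(threshold, *diseased)

def auc(healthy, diseased, n=20000):
    """Midpoint-rule approximation of the area under the ROC curve on (0, 1)."""
    return sum(roc((i + 0.5) / n, healthy, diseased) for i in range(n)) / n
```

When both groups share the same parameters, the ROC curve reduces to the diagonal and the AUC is 0.5, which gives a quick sanity check on any implementation.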
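To account for uncertainty in the location and scale parameters, confidence intervals based on chi-square distributions are constructed for each group. Taking the lower limit, the point estimate, and the upper limit of each interval gives 9 scale-location combinations per group, and pairing the combinations of the two groups yields a 9 × 9 matrix of 81 AUC estimates that reflects the variability in the parameter estimates.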
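The construction of the AUC matrix can be sketched as follows, assuming each confidence interval contributes three values (lower limit, point estimate, upper limit) and the shape parameter is held fixed. The helper names, the numerical AUC routine, and the parameter triples are hypothetical placeholders and are not taken from the paper; the chi-square interval formulas themselves are not reproduced.

```python
import math
from itertools import product

def ge_auc(healthy, diseased, n=2000):
    """Midpoint-rule estimate of P(diseased score > healthy score).

    Each group is (shape, scale, loc) under the assumed CDF
    F(x) = (1 - exp(-(x - loc)/scale))**shape.
    """
    (a_h, s_h, m_h), (a_d, s_d, m_d) = healthy, diseased
    total = 0.0
    for i in range(n):
        p = 1.0 - (i + 0.5) / n                          # 1 - false positive rate
        c = m_h - s_h * math.log1p(-(p ** (1.0 / a_h)))  # threshold at that rate
        if c <= m_d:
            total += 1.0                                 # every diseased score exceeds c
        else:
            total += 1.0 - (-math.expm1(-(c - m_d) / s_d)) ** a_d
    return total / n

def auc_matrix(shape_h, scales_h, locs_h, shape_d, scales_d, locs_d):
    """9 x 9 matrix: rows are healthy-group combinations, columns diseased-group ones."""
    healthy = [(shape_h, s, m) for s, m in product(scales_h, locs_h)]
    diseased = [(shape_d, s, m) for s, m in product(scales_d, locs_d)]
    return [[ge_auc(h, d) for d in diseased] for h in healthy]

# Placeholder triples (lower limit, point estimate, upper limit), for illustration only.
matrix = auc_matrix(2.0, (0.9, 1.0, 1.1), (0.4, 0.5, 0.6),
                    2.0, (1.8, 2.0, 2.2), (0.9, 1.0, 1.1))
```

With this ordering, the central cell `matrix[4][4]` is the AUC computed from the point estimates of both groups.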
Two methods for pooling these AUC estimates are proposed: a simple average and a fixed weights method based on probability assignments. A numerical example with specified parameters illustrates the calculation of the joint AUC matrix using these methods.
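The two pooling rules can be sketched as below. The specific weights are an assumption: the probabilities 0.25, 0.50, 0.25 for (lower limit, point estimate, upper limit) and the product rule for combining them are hypothetical, since the section does not state the paper's actual probability assignments.

```python
from itertools import product

def simple_average(matrix):
    """Unweighted mean of all AUC values in the matrix."""
    cells = [v for row in matrix for v in row]
    return sum(cells) / len(cells)

def weighted_average(matrix, row_weights, col_weights):
    """Weighted mean where cell (i, j) gets weight row_weights[i] * col_weights[j]."""
    if abs(sum(row_weights) - 1.0) > 1e-9 or abs(sum(col_weights) - 1.0) > 1e-9:
        raise ValueError("each weight vector must sum to 1")
    return sum(rw * cw * v
               for rw, row in zip(row_weights, matrix)
               for cw, v in zip(col_weights, row))

# Hypothetical probabilities for (lower limit, point estimate, upper limit).
three_point = (0.25, 0.50, 0.25)
# One weight per scale-location combination, in the same order as the matrix rows/columns.
combo_weights = [ws * wl for ws, wl in product(three_point, three_point)]
```

Given a 9 × 9 matrix of AUC values, `weighted_average(matrix, combo_weights, combo_weights)` puts the most weight on the cell built from the point estimates, while `simple_average(matrix)` treats all 81 cells equally.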
Conclusion
In this paper we have developed a new method of estimating the AUC of the Bi-Generalized Exponential ROC model by using interval estimates. We have identified 9 different combinations that can be considered for evaluating the AUC using the Normal probability distribution. The new estimate takes the 81 possible AUC values and combines them as a weighted average under two different weighting schemes. It is shown that for large samples all the methods work equally well. Further, the standard error of the estimate decreases steeply as the sample size increases.